The RBSE Spider - Balancing Effective Search Against Web Load
Author
Abstract
The design of a Web spider entails many things, including a concern for reasonable behavior, as well as more technical concerns. The RBSE Spider is a mechanism for exploring World Wide Web structure and indexing useful material thereby discovered. We relate our experience in constructing and operating this spider.
Similar Papers
Data Partition and Job Scheduling Technique for Balancing Load in Cluster of Web Spiders
Search engines rely primarily on web spiders to collect large amounts of data for indexing and analysis. Data collection can be performed by several web-spider agents running in a parallel or distributed manner over a cluster of workstations. This parallelization is often necessary in order to cope with a large number of pages in a reasonable amount of time. However, while keeping a good paralle...
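One common way to partition crawl work across a cluster of spider agents (a minimal sketch, not necessarily the scheduling technique of the paper above) is to hash each URL's host so that all pages of a host go to the same agent, which also keeps per-host politeness local to one worker. The function name and agent count here are hypothetical:

```python
import hashlib

def assign_url(url: str, num_spiders: int) -> int:
    """Assign a URL to one of num_spiders agents by hashing its host.

    All pages from the same host map to the same agent, so per-host
    politeness (crawl delays) can be enforced by a single worker.
    """
    # Crude host extraction; a real crawler would use urllib.parse.
    host = url.split("/")[2] if "://" in url else url
    digest = hashlib.md5(host.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_spiders

# Pages from the same host land on the same spider agent.
urls = [
    "http://example.org/a",
    "http://example.org/b",
    "http://example.com/x",
]
assignments = [assign_url(u, 4) for u in urls]
```

Hash-based partitioning balances load statistically but can skew when a few hosts dominate the crawl, which is why job-scheduling techniques like the one surveyed above are needed on top of it.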
Load Balancing Approaches for Web Servers: A Survey of Recent Trends
Numerous works have been done on load balancing of web servers in the grid environment. The reason behind the popularity of the grid environment is that it allows access to distributed resources located at remote sites. For effective utilization, load must be balanced among all resources. The importance of load balancing is discussed by distinguishing the system between without load balancing and with loa...
Web-Collaborative Filtering: Recommending Music by Spidering The Web
We show that it is possible to collect data that is useful for collaborative filtering (CF) using an autonomous Web spider. In CF, entities are recommended to a new user based on the stated preferences of other, similar users. We describe a CF spider that collects lists of semantically related entities from the Web. These lists can then be used by existing CF algorithms by encoding them as "pse...
Parallel Web Spiders for Cooperative Information Gathering
A Web spider is a widely used approach to obtaining information for search engines. As the size of the Web grows, parallelizing the spider's crawling process becomes a natural choice. This paper presents a parallel web spider model based on a multi-agent system for cooperative information gathering. It uses a dynamic assignment mechanism to eliminate redundant web pages caused by parallelization....
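The deduplication problem that dynamic assignment addresses can be illustrated with a shared URL frontier: idle agents pull the next unseen URL, and newly discovered links are enqueued only once. This is a simplified single-process sketch under assumed semantics, not the paper's multi-agent protocol:

```python
from collections import deque

class Frontier:
    """Shared URL frontier for parallel spider agents (simplified sketch).

    Agents call next_url() to get work and add_links() to contribute
    discovered links; the seen-set ensures no page is crawled twice.
    """

    def __init__(self, seeds):
        self.queue = deque(seeds)
        self.seen = set(seeds)

    def next_url(self):
        """Hand the next pending URL to an idle agent, or None if empty."""
        return self.queue.popleft() if self.queue else None

    def add_links(self, links):
        """Enqueue only URLs no agent has seen, dropping redundant ones."""
        for link in links:
            if link not in self.seen:
                self.seen.add(link)
                self.queue.append(link)

frontier = Frontier(["http://a", "http://b"])
frontier.add_links(["http://a", "http://c"])  # duplicate "http://a" dropped
```

In a real multi-agent deployment the seen-set and queue would live in a coordinator process (or be partitioned across agents), with locking or message passing around both operations.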